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The large scale anisotropies of WMAP data have attracted a lot of attention and have been a 
source of controversy, with many of favourite cosmological models being apparently disfavoured by 
the power spectrum estimates at low I. All of the existing analyses of theoretical models are based 
on approximations for the likelihood function, which are likely to be inaccurate on large scales. 
Here we present exact evaluations of the likelihood of the low multipoles by direct inversion of 
the theoretical covariance matrix for low resolution WMAP maps. We project out the unwanted 
galactic contaminants using the WMAP derived maps of these foregrounds. This improves over 
the template based foreground subtraction used in the original analysis, which can remove some of 
the cosmological signal and may lead to a suppression of power. As a result we find an increase in 
power at low multipoles. For the quadrupole the maximum likelihood values are rather uncertain 
and vary between 140-220/iK 2 . On the other hand, the probability distribution away from the 
peak is robust and, assuming a uniform prior between and 2000/^K 2 , the probability of having 
the true value above 1200/iK 2 (as predicted by the simplest ACDM model) is 10%, a factor of 
2.5 higher than predicted by WMAP likelihood code. We do not find the correlation function to 
be unusual beyond the low quadrupole value. We develop a fast likelihood evaluation routine that 
can be used instead of WMAP routines for low £ values. We apply it to the Markov Chain Monte 
Carlo analysis to compare the cosmological parameters between the two cases. The new analysis 
of WMAP either alone or jointly with SDSS and VSA reduces the evidence for running to less 
than 1-cr, giving a a — —0.022 ± 0.033 for the combined case. The new analysis prefers about 1-er 
lower value of Q m , a consequence of an increased ISW contribution required by the increase in the 
spectrum at low £. These results suggest that the details of foreground removal and full likelihood 
analysis are important for the parameter estimation from WMAP data. They are robust in the 
sense that they do not change significantly with frequency, mask or details of foreground template 
marginalization. The marginalization approach presented here is the most conservative method 
to remove the foregrounds and should be particularly useful in the analysis of polarization, where 
foreground contamination may be much more severe. 

PACS numbers: 98.70.Vc 



I. INTRODUCTION 



Data analysis of cosmic microwave background maps 
is a challenging numerical problem. The question that 
we want to answer is the probability (or likelihood) of 
a theoretical model given the data. In order to evalu- 
ate the exact likelihood of a theoretical power spectrum 
of CMB fluctuations given a sky map of these fluctua- 
tions it is necessary to invert the theoretical covariance 
matrix. This operation scales as 0(N 3 ), where N is the 
length of the data vector and is currently limited by prac- 
tically available computer technology to N < 10 4 . One 
is hence forced to use approximate estimators when in- 
ferring the power spectrum from data such as WMAP 
satellite Q, which have 1-2 orders of magnitude more 
independent measurements. The most popular methods 
are the pseudo-Cl (PCL) method (see e.g. 0) and the 
Quadratic Maximum Likelihood (QML) estimator (see 
e.g. |3) • Both of these methods produce as an inter- 
mediate step estimates of multipole moments Ct and ap- 
proximate methods have been developed to describe their 



probability distributions as accurately as possible |j, |fj . 
These perform satisfactorily for high £ values, where the 
central limit theorem guarantees a Gaussian distribution 
(in offset lognormal transformed variables) will be a good 
approximation. Unfortunately, these methods are much 
less reliable at low multipoles, where the distributions 
are not Gaussian. The situation is complicated further 
by the masks applied to the data to remove the galactic 
foreground contamination and by the marginalization of 
unwanted components, all of which makes analytic ap- 
proach unreliable. In [6j it was suggested to use a hybrid 
approach using QML on degraded maps at low £ and 
PCL at higher multipoles. 

The issue of the exact values of multipole moments in 
WMAP data has attracted much attention since the orig- 
inal analysis by WMAP team 0. Several unusual fea- 
tures have been pointed out already in the original anal- 
ysis. One of these was the correlation function, which 
appears to almost vanish above 60°. Another was the 
low value of the quadrupole. With the PCL analysis the 
value of the quadrupole was found to be ~ 123/xK 2 , com- 
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pared to the expected value of ~ 1200/iK 2 for the sim- 
plest ACDM model. The probability for this low value 
was estimated to be below 1%, depending on the param- 
eter space of models. The discussion of the statistical 
significance of the low values of q uadrupole and octopole 
in the WMAP data H GU [TJ has sparked a renewed 
interest into the so called estimator induced variance [T^ 
- the error in the likelihood evaluation arising due to the 
use of an estimator rather than the exact expression. In 
[l2| it has been argued that [QML estimator performs 
significantly better than the PCL estimator and that the 
true value of the quadrupole probably lies in the range 
around 170 — 250/iK 2 . However, only the maximum likeli- 
hood value was computed and not the full likelihood dis- 
tributions so the statistical significance of this result and 
its effect on the parameter estimation remained unclear. 
In addition, the role of foregrounds and monopole/dipole 
removal has not been explored in detail. 

In this paper we take a different approach. We argue 
that the actual value of the best fitted quadrupole (and 
other multipoles) is not of the main interest, since it can 
be quite sensitive to the details of the foreground sub- 
traction procedure, type of mask used and numerical de- 
tails of the analysis (in fact, the various values proposed 
so far may even be statistically indistinguishable if the 
likelihood function at the peak is very broad). What is 
more important is the probability or likelihood of a model 
given the data, compared to another model that may, for 
example, fit the data better. This is encapsulated in the 
likelihood ratio between models and within the Bayesian 
context is the only information we really need to asses 
the viability of cosmological models that belong to a cer- 
tain class. In this paper we perform the exact likelihood 
calculation by a direct inversion of the covariance matrix 
for the low resolution maps, thus eliminating all the un- 
certainties related to estimator variance approximations. 
Since we use low resolution maps with less than 3000 
pixels we can do the inversions with a brute force linear 
algebra routines. This means we cannot do the analysis 
on all of the multipole moments, so we analyse low mul- 
tipoles with the exact method and use PCL analysis for 
the higher multipoles, where the two methods agree with 
each other and where the approximate variance estimates 
developed for PCL analysis are likely to be valid. 

Second issue we wish to address in this paper is the 
question of foreground subtraction. This is done in two 
steps. First, pixels with high degree of contamination 
are completely removed from the data. This results in 
the so called KP2 (less aggressive, 85% of the sky) and 
KPO (more aggressive, 75% of the sky) masks [l3j. There 
remains contamination even outside these masks in indi- 
vidual frequency channels. This contamination can be 
further reduced using templates and/or frequency infor- 
mation [l^. In WMAP analysis the templates were fit- 
ted for and subtracted out of WMAP data. Even with 
a perfect template there is a danger that this procedure 
can oversubtract the foregrounds, since one is essentially 
subtracting out the maximum amplitude consistent with 



the template which could include some of the signal. In- 
stead, here we do not subtract out the templates, but 
marginalise over them by not using any information in 
the data that correlates with a given template. This pro- 
cedure has not been applied to WMAP data in previous 
analyses. It guarantees that there is no statistical bias 
caused by the foreground removal. 

Some of the templates that were subtracted in WMAP 
analysis, particularly 408MHz Haslam synchrotron radi- 
ation map [3| > are of poor quality. WMAP produced 
a better set of templates applying Maximum Entropy 
Method (MEM) to WMAP maps in several frequency 
channels using templates as priors only |13| . In addition 
to the Haslam synchrotron map, they used |Tfij H-a map 
as a tracer of free-free emission and the SFD dust tem- 
plate based on [l^. This process resulted in three MEM 
derived foreground maps. These, however, were not used 
to infer the power spectrum. Instead, the official power 
spectrum was determined from the integrated single fre- 
quency maps and the same templates that were used as 
priors for the MEM map making procedure, ignoring the 
MEM derived maps. 

The MEM derived maps are likely to be the most 
faithful representation of the foregrounds. When used 
in power spectrum inference, however, they must be 
used with care due to complicated nature of their signal 
and noise correlations [l^. Nevertheless, on the largest 
scales, where receiver noise is negligible, they are proba- 
bly the best available option. We therefore use the inte- 
grated single channel maps and the MEM derived fore- 
ground templates as a basis of our work. Note that in 
foreground marginalization procedure no template is ac- 
tually removed from the data and there is no danger of in- 
troducing noise correlations that could significantly affect 
the power spectrum estimates. We perform this process 
on foreground unsubtracted maps of the V and W chan- 
nels of the WMAP satellite. We use both KP2 and KPO 
masks and project out the remaining galactic contamina- 
tion using MEM inferred maps of dust, synchrotron and 
free-free foregrounds. We use the likelihood evaluated in 
this way to asses the statistical significance of departure 
from the concordant model at low multipoles and to per- 
form the statistical analysis of cosmological models given 
the data. 

WMAP team also produced the so called Internal Lin- 
ear Combination (ILC) map of the CMB emission, by 
using internal maps at various frequencies to decompose 
them into CMB and foreground components. This ap- 
proach is not based on any templates and so uses less 
information than in principle available. While visually 
these maps appear to be relatively free of contamina- 
tion outside the galactic plane, there are still artifacts 
within the plane. This means that one must be careful 
when projecting out monopole and dipole: one should 
not simply remove them from the all-sky map, since they 
could be contaminated by galactic emission at the canter 
and this would leave a residual offset outside the galac- 
tic plane, which could contaminate all of low multipoles. 
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One must again apply the marginalisation over monopole 
and quadrupole on the masked map to eliminate any con- 
tamination in the final result. A similar approach has 
been taken by 01 an d 0] , who produced their own ver- 
sions of ILC maps. Since we argue that the best method 
is to use single frequency maps together with correct tem- 
plates and we use ILC map for illustration and cross- 
check purposes only, we do not consider these alternative 
ILC solutions further. 



where d is the data vector. To evaluate the likelihood of 
a given theoretical model we simply evaluate this expres- 
sion, computing the covariance matrix using the theoret- 
ical model spectrum Cg in equation ^ 



B. Choice and preparation of maps 

II. METHOD 



A. Likelihood evaluation 

Given noise-less and independent measurements of the 
CMB sky d, the theoretical covariance matrix for these 
measurements is given by |l9j 



(i) 



where Cg is the power spectrum, Pg is the Legendre Poly- 
nomial of order £ and Qi.j is the angle between ith and 
jth point on the sky. We also define 



C e £(£ + 1) 
2tt 



(2) 



which is the quantity that is conventionally plotted (and 
often referred to) as the power spectrum. 

In addition to the covariance matrix in equation^ we 
want to project out linear components of the data vec- 
tor that correspond to known contaminants in our data. 
Fortunately, there exist a standard procedure for this 
|20| : the covariance matrix of the contaminant is cal- 
culated and added to the theoretical covariance matrix 
with a very large variance. Here the covariance matrix 
of the template is given by C = (LL'), where L is the 
template vector. Using this method, we project out the 
map's monopole, dipole and the known galactic contam- 
inants, namely dust, synchrotron and the free- free emis- 
sion. For completeness we add the diagonal noise com- 
ponent Nu = cr|, although this is not strictly required 
for this analysis, because the noise power spectrum is 
< 10/iK 2 on scales of our interest. 

Hence, the total covariance matrix can be written as 

^itotal C -\- N -|- A(C^ US ^ -|- £J s y nc h £jfree— free _|_ 

C i= + C i=i ) (3) 

The value of A in the above equation must be large 
enough so that unwanted components are projected out. 
If it is too large, however, the numerical errors start to 
affect the results. 

The logarithm of likelihood of given Cgs can then be 
written as 
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(log |C| + Anog27r) 



(4) 



As mentioned in the introduction the procedure de- 
scribed above can realistically be performed only on mod- 
estly sized maps. We decrease the resolution of a given 
map using the following procedure: Firstly, the full res- 
olution source map is multiplied by the mask, whereby 
every masked pixel is zeroed, while unmasked remaines 
the same. The map is then smoothed by the 5° FWHM 
Gaussian beam and resampled at a lower Healpix [2l| 
resolution (nside=16), giving 3072 roughly independent 
pixels on a full-sky map. The mask itself is smoothed 
in the same manner and this gives us information by 
how much the smoothed pixels that were affected by the 
mask need to be up-scaled. We do not use pixels whose 
smoothed mask value drops below 0.7. We use this infor- 
mation to reduce the effective scale of smoothing beam 
(by square root of this correction) in the calculation of 
covariance matrix, although we verified that this does not 
affect any of the final results. 

We have also attempted an exact calculation of the 
window function treating each subpixel of a low resolu- 
tion Healpix map separately. Unfortunately, this is com- 
putationally prohibitively expensive. Instead, we have 
performed weighted averaging within each low-resolution 
Healpix pixel using the effective Healpix window func- 
tion provided with the package and get compatible re- 
sults. We chose not to adopt this approach for the main 
analysis since the individual Healpix pixel windows are 
anisotropic, depend on the mask and are very slowly 
dropping off with I. For our resolution level the effec- 
tive windows (which are only valid for full sky coverage) 
are only given up to £ = 64 and there is still a lot of 
power beyond that. 

The gaussian smoothing procedure is used on WMAP 
integrated maps for V and W channels and for the MEM 
maps for the three major foregrounds: Dust, Synchrotron 
and Free-Free emission. In all cases, the low resolution 
maps were produced for the KP2 mask and for the more 
conservative KPO mask. We also applied the same proce- 
dure to the ILC map, except that in this case we smooth 
over the whole map and so do not need to upscale the 
pixel values by the effect of the mask. By changing vari- 
ous parameters of the inversion process we get consistent 
likelihoods and we estimate the uncertainty in likelihood 
evaluation to be about 0.2 in logarithm of the likelihood. 
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III. MULTIPOLE MOMENTS AND THEIR 
STATISTICAL SIGNIFICANCE 

In figure^we show the maximum likelihood (ML) val- 
ues of the multipoles up to I — 18 for several of our 
basic cases. One can see from this figure that most of 
the estimates up to I ~ 10 are above the PCL values 
given by WMAP team, while at higher multipoles the 
two agree well {I = 11 appears anomalous and PCL gives 
a much higher value than the exact likelihood analysis). 
Some of the differences are due to random fluctuations: 
KP2 mask contains 85% of the sky compared to 75% 
for KPO and this can lead to differences in the two es- 
timates. Similarly, projecting out the foreground tem- 
plates reduces the amount of information, so there can 
be statistical differences between our analysis and the one 
without marginalization. While the differences between 
KPO and KP2 masks are likely to be within the allowed 
range of statistical fluctuations, this is less likely for the 
differences between WMAP-PCL and our analysis of V 
with KP2 mask, since the same mask and channel have 
been used by WMAP. The difference is partly due to 
the use of the exact likelihood analysis and partly due 
to the foreground marginalization. While WMAP team 
marginalized over monopole and dipole, they subtract 
out the foregrounds with the maximal amplitude, which 
may have removed some of the true cosmological signal 
and pushed the values lower. To eliminate the bias that 
can arise from this procedure it is best to exclude the 
information in the signal that correlates with templates 
and with monopole/dipole. This reduces the statistical 
power, but is guaranteed to be unbiased. 

To investigate further the robustness of our results fig- 
ure |21 shows the most likely values of power spectra (up 
to multipole £ = 10) for various combinations of template 
choice for W channel data and for the ILC map. This re- 
alistically indicates potential systematic differences aris- 
ing due to choice of templates. On the same graphs we 
also show the reduction performed with Healpix win- 
dow functions (where pixels of the low-resolution map 
were only weighted averages of all corresponding pixels 
in the high-resolution map), indicating that our results 
are robust with respect to choice of window function and 
smoothing procedure. The differences between the var- 
ious cases are small compared to the difference between 
exact evaluations and WMAP values. We emphasize that 
ML values are the least robust part of the analysis and 
it is much more important that the probability distribu- 
tions away from the peak are consistent. For quadrupole 
we discuss this below, while the overall impact on the 
cosmological parameter estimation is discussed in next 
section. 

Marginalisation procedure is guaranteed to give un- 
biased results independent of the form of the template 
or its nongaussian properties. The only assumption is 
that the template is not correlated with the true CMB, 
which could happen if the templates are produced from 
the CMB data itself and were affected by noise, calibra- 



tion or beam uncertainties. It is unlikely that this would 
happen on large scales. We have tested this hypothesis by 
using the extrenal templates instead of MEM templates, 
without finding much differences in the result (figure |2J). 

Figure |3| shows the probability distribution for the 
value of C2 for several cases, assuming best fit ACDM 
model for other Qs. Particular choice of other Qs affects 
the inferred curves, although at a level below the vari- 
ance between various curves. It has several interesting 
features. Firstly, when all marginalizations are used, the 
V channel, W (not shown) and ILC give very consistent 
results. In the absence of marginalization and foreground 
subtraction the V and W channel maps are very affected 
by foregrounds and ML values reach up to 500/iK 2 . The 
ILC map could be affected by the foreground marginal- 
ization; its value drops from ~ 220/iK 2 (consistent with 
H2) to ~ 170/iK 2 when projection is included in analy- 
sis. The ILC map may suffer even more from the resid- 
ual monopole / dipole contamination, which pushes the 
quadrupole value up. 

While our procedure of marginalising over 3 templates 
is the most conservative, one may worry that it is un- 
necessary. Some of the channels are not really strongly 
contaminated by all 3 components and if frequency scal- 
ing is known then multi-frequency information can be 
used to constrain a given component in a given channel. 
While there is nothing wrong with our procedure one 
could argue that it reduces the amount of information. 
The number of eliminated modes is roughly given by the 
number of templates used, but since the templates are 
correlated (being all dominated by our galaxy) the in- 
formation loss from large scale modes is likely to be less 
than 3. It is also not clear how the templates couple to 
different multipoles. To test these effects we perform the 
analysis in W channel, where foreground contamination 
is dominated by dust. We use marginalization only over 
SFD dust template (subtracting out the free-free compo- 
nent and doing nothing for synchrotron). We find this 
has very little effect on the maximum likelihood values 
of multipoles, as shown in figure [3 For t — 2 we find 
ML value at 220/iK 2 , slightly higher than in other cases 
(figure^), but the overall probability distribution is very 
similar to other cases. The effect of this procedure on 
the parameter estimation is explored in the next section. 

It is interesting to asses the statistical significance of 
the departure of the lowest multipoles from the concor- 
dant model. Our focus is not on the actual statistical 
procedure of assessing this departure (see e.g. @), but 
on the effect of estimator induced variance. We consider 
5 cases: all possible combinations of the choice of mask 
(KP2 or KPO) and frequency (V channel or W channel) 
and the official WMAP likelihood code HE^. The in- 
ferred maximum likelihood values (figure lie in the 
range 140/iK 2 — 220/fK 2 , but the likelihood function is 
broad at the peak and the exact value of the maximum 
likelihood estimate is driven by small details in the anal- 
ysis: in all of our basic cases the likelihood is within 
10-20% of the peak value over the range (120-250)/^K 2 . 
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FIG. 1: This figure shows the maximum likelihood power 
spectrum for several combinations of frequencies and masks. 
Note that all spectra agree reasonably well beyond l=\\. 
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FIG. 2: This figure shows the maximum likelihood power 
spectrum up to £ — 10 for several test cases. The derived 
features in the most likely values of power spectrum are robust 
with respect to choice of window function (gaussian versus 
healpix), templates (MEM versus external, dust only versus 
standard 3 templates in W) and maps (W versus ILC). 



Thus our results are consistent both with the original 
WMAP value (123/zK 2 ) and the values in .12J and there 
is no "correct" value given the level of foreground con- 
tamination. 

As we argued in introduction, the precise value of 
maximum likelihood estimator is not of primary inter- 
est, given that it can be strongly affected by the details 
of the analysis. Much more important for the question 
of parameter estimation is the shape of the likelihood 
function. Figures 13141 shows that while the maximum 
likelihood value of the quadrupole is quite uncertain, all 
of our cases give very similar shapes of the likelihood 
function. This likelihood distribution is not consistent 
with the likelihood provided by WMAP team, which ap- 
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FIG. 3: This figure shows the probability distributions for 
the value of C2 as inferred for the various combinations of the 
monopole/dipole and the foregrounds marginalization. Also 
shown is the official WMAP likelihood code output. Note 
that C2 in V without monopole/dipole or the foregrounds 
marginalization is heavily contaminated and gives very high 
ML values, while all the other cases have very similar proba- 
bility distributions (except WMAP code). 



pears to underestimate the errors associated with the 
galactic cut and marginalizations. The WMAP likeli- 
hood of the concordant model C2 (~ 1200^K 2 ) is roughly 
2.5 times too low with respect to the most likely point 
when compared to our likelihood values. This change in 
the shape of the likelihood function affects the parame- 
ter estimation, particularly the running of the spectral 
index, as shown in section IIVI We note here that not 
performing the marginalization over foregrounds and/or 
monopole/dipole would lead to an even higher probabil- 
ity of concordance model compared to low C2 models (see 
the corresponding probability distributions in figure [3J), 
but these are more likely to be contaminated and should 
not be used in the likelihood analysis. 

Figure shows the integrated probability as a function 
of the true value of the quadrupole (integrated from large 
values downward), under the assumption that the prior 
distribution of quadrupole values is uniform between 
and 2000/iK 2 . This prior is is adopted due to the fact 
that the concordance value of quadrupole is ~ lOOO^/K 2 
(see e.g. 0). This gives the probability of the true value 
exceeding C2 assuming this prior. We find that this prob- 
ability is around 10%, as opposed to 4% by the WMAP 
likelihood analysis. Thus with uniform prior on values 
of C2 the probability of the true quadrupole to be above 
that predicted by the concordance model is not particu- 
larly small. It becomes even larger if the upper limit at 
2000/iK 2 is removed, in which case we find 18% proba- 
bility of the true value exceeding the concordance value. 

Note that WMAP team chose to give the statistical 
significance of the low quadrupole in terms of number 
of random realizations of theoretical models in Monte 
Carlo Markov Chains (MCMC) for which the extracted 
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quadrupole is lower than the observed value of 123/iK 2 . 
This is a frequentist statistic which cannot be directly 
compared to the one we defined here in the context of 
Bayesian statistics. The frequentist approach leads to 
lower numbers (less than 1%, compared to 4% above) for 
the specific value of the quadrupole obtained by WMAP, 
but the probability is likely to be higher if our analysis 
procedure was applied to the data given our broader like- 
lihoods and higher values of the best fitted quadrupole. 
The WMAP analysis does not include the uncertainties 
in the foreground subtractions, which should have an im- 
portant effect given the skewed nature of the probability 
distributions: if an error estimate of 50/xK 2 on C2 were 
added to the measured value it would lead to an increased 
probability of concordance model. In order to truly de- 
couple the cosmic variance uncertainty for the errors aris- 
ing from the galactic cut and foregrounds one would need 
to infer the probability distribution for a particular real- 
isation of a' im s. For a full-sky CMB observation with no 
galactic contamination, this would be a delta function; 
galactic cut and large scale contamination would spread 
the probability over a finite region. This distribution, 
marginalised to produce p{{o^ m ) m ) would be the correct 
quantity that must be compared to the concordant value 
and the corresponding cosmic variance. Work on this 
front is currently in progress. 

While the frequentist approach does allow one to test a 
model (or a class of models) independent of other models, 
it is still not free of assumptions. Testing the quadrupole 
on its own only makes sense if we believe that there is 
something special about it, for example because it is sen- 
sitive to the physics on the largest scales, which may not 
be probed by lower multipoles. If it is not viewed as spe- 
cial, but only one of the many estimated multipoles, then 
the probability of one of them being this low is signifi- 
cantly higher. This is tested in the frequentist approach 
with the goodness of fit (x 2 ), which for WMAP does not 
reveal any particular anomalies. Unfortunately there is 
no hope to resolve these statistical questions completely 
with only one observed sky. 

In figure we plot the contour plots of parameters on 
the C2-C3 plane for the considered models. This shows 
that the likelihoods between C2 and C3 are only weakly 
correlated, both for exact likelihood evaluation as well 
as for the PCL approximation. In original analysis there 
was some evidence for both C2 and C3 being low, so that 
the overall significance was between 2 — 3c (figure |BJl. 
The evidence for discrepancy weakens below 2<r with our 
analysis and is consistent among the four cases. 

WMAP team presented further evidence of unusual na- 
ture of large scale correlations using the correlation func- 
tion, which appears to vanish on angles above 60° Q. 
Correlation functions are notoriously difficult to inter- 
pret due to the correlated nature of the values at differ- 
ent angles, so one must be careful not to over-interpret 
such results. In figureElwe show the correlation function 
analysis for these cases, compared to the original WMAP 
analysis and to theoretical predictions of KCDM model. 
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FIG. 4: This figure shows the probability distributions for 
the value of C2 as inferred for the various combinations of the 
selected channel and mask and the official WMAP likelihood 
code. The upper panel shows the normalised probability dis- 
tribution, while the lower panel shows the probability relative 
to the most likely point. Note that the lower panel's vertical 
axis is logarithmic. Values of other Cis were set to those of 
the best fit ACDM model. 
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FIG. 5: Cumulative probability as a function of the true value 
of the quadrupole (integrated from large values downwards 
assuming < C 2 < 2000^K 2 ). 
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FIG. 6: . In this figure we show the probability distribution 
function on the C2-C3 plane for all considered possibilities and 
the original WMAP likelihood code. Contours correspond to 
the one, two, three and four sigma assuming top-hat priors on 
the plotted limits (0 < C 2 < 1500^K 2 ,300 < C 2 < 1800/xK 2 ). 
The dashed circles correspond to the approximate values of 
the concordant model. 

We also show the result for the KCDM model where C2 
has been lowered to 150^/K 2 , keeping the other multi- 
poles unchanged. Several features are apparent from this 
figure. First, theoretical predictions for large scale corre- 
lation function are largely driven by the quadrupole and 
lowering its value to 150/iK 2 brings the correlation func- 
tion into a significantly better agreement with the ob- 
servations than the unmodified KCDM model. Second, 
our results significantly modify the predicted correlation 
function and the deviations from zero on large angles are 
now much more evident, both in the positive direction 
and in the negative direction at very large angles. To 
investigate it further WMAP team introduced a statis- 
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FIG. 7: This figure shows the autocorrelation function for all 
considered cases, the AC DM model favoured by the WMAP 
data and the same model with C2 set to 150/iK 2 .. 

tic S = J, [C(0)] 2 dcos 9. This is a posteriori statistic 
that was designed to maximise the effect, so its statisti- 
cal significance is difficult to evaluate. We find that its 
value increases from 1691 for WMAP analysis to 4197 
(W KPO), 5423 (V KPO), 9086 (V KP2), 7698 (W KP2) 
and 5832 (W KP2, dust marginalization only). While its 
value for standard KCDM model is 49625, reducing the 
quadrupole to 150>K 2 changes this to 8178, below the 
value we find in the case of V KP2. We conclude that 
there is no obvious anomaly in the correlation function 
beyond the fact that the quadrupole is low and there is 
no evidence of the correlation function vanishing on large 
angles. 



IV. PARAMETER ESTIMATION 

In order to asses the effect of the exact likelihood eval- 
uations on the inferred cosmological parameters we have 
run the Markov Chain Monte Carlo parameter estima- 
tions using the original WMAP likelihood code and the 
code modified to use the exact calculations at lowest mul- 
tipoles. The total likelihood was calculated by evaluat- 
ing the likelihood for the £ < 12 multipoles using the 
exact matrix inversion and adding the likelihood eval- 
uated from the remaining multipoles using the WMAP 
likelihood code. The power spectrum values for the mul- 
tipoles I > 12 in the exact likelihood code were kept 
at the WMAP PCL most likely model when calculat- 
ing the covariance matrix: this ensures that the like- 
lihood is not "accounted for" twice. We also neglect 
the anti-correlation between £ = 12 and I = 13 modes 
at the boundary. The evaluation of the exact likeli- 
hood typically takes around a few seconds on a mod- 
ern workstation and this is less than the time it takes to 
evaluate a theoretical CMB power spectrum with CMB- 
FAST 23]. Therefore, using the exact likelihood code 
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does not slow down the MCMC parameter estimation 
significantly. Each of the chains described below con- 
tains 100,000-200,000 chain elements, the success rate 
was of order 30-60%, correlation length 10-30 and the 
effective chain length of order 5,000-15,000. We use 8-24 
chains and in terms of Gelman and Rubin i?-statistics [24| 
we find the chains are sufficiently converged and mixed, 
with R < 1.01, compared to recommended value R < 1.2 
or more conservative value R < 1.1 adopted by WMAP 
team 

The likelihood also uses the information contained 
in the polarization-temperature (TE) cross-correlation 
power spectrum using the official WMAP likelihood code, 
which uses similar approximations as temperature power 
spectrum and completely ignores correlations between 
TT and TE power spectra. We cannot yet use the exact 
evaluations since the polarization maps are not publicly 
available at this time. 

We ran several MCMCs using a custom developed soft- 
ware described in (25|. We consider only flat models. We 
begin with the simplest 5-parameter models 

p = (r, U! b , w cdm , TZ, Q m ), (5) 

where r is the optical depth, toi, ~ f2f,/i 2 is proportional 
to the baryon to photon density ratio, w c dm = H cc i m /i 2 
is proportional to the cold dark matter to photon den- 
sity ratio, fl m = ficdm + = 1 — Q\ is the matter 
density today and 1Z is the amplitude of curvature per- 
turbations at k — 0.05/Mpc (we replace this parame- 
ter with as in table 1). To reduce the degeneracies we 
use w;,, w c d m , angular diameter distance 8 S , In 1Z and 
In 1Z — t — 0.51og(oj/, + w cl j m ) instead of parameters in 
equation [SI adopting broad flat priors on them. Most of 
these priors are not important because the parameters 
are well determined. The exception is optical depth, for 
which we additionaly apply r < 0.3 on some of MCMCs 
following WMAP team. 

The simple 5-parameter model is sufficient to obtain 
a good fit to the WMAP data. We add CBI+ACBAR 
to the WMAP data [U H3 and follow WMAP team 
in denoting this dataset as WMAPext. Second set of 
MCMCs we ran was also based on WMAPext data, but 
with an expanded set of parameters which include pri- 
mordial slope n s , its running a s — dn s /dhik and ten- 
sors (parametrised with r = T/S), adopting flat priors 
on these parameters. Adding these 3 parameters only im- 
proves x 2 by 5, so they are not really needed to improve 
the fit to the data. Because of this we find significant 
degeneracies among many of the parameters. The best 
fitted values are not necessarily very meaningful and they 
could be significantly influenced by the assumed priors, 
but we can still compare the changes between the new 
and original analysis. Third set of MCMCs was based 
on the combined WMAPext+SDSS analysis [2j|, which 
breaks some of these degeneracies. Last set of MCMCs 
was based on WMAP+VSA 29], both with and without 
SDSS. We remove r < 0.3 constraint for this case. The 
results are shown in tables 1-2. 



A. Matter density 

In 5-parameter chains Q m is the parameter that 
changes most by the new analysis. Its probability distri- 
bution from various MCMCs is shown in figure [H] This 
parameter is not well determined from the CMB data, 
since it only weakly affects the positions of acoustic peaks 
in a flat universe. This leaves the integrated Sachs- Wolfe 
effect on large scales as an important way to constrain 
f2 m : reducing f2 m leads to a decay in the gravitational 
potential, which increases the contribution to the large 
scale anisotropies from the line of sight integration of the 
time derivative of gravitational potential. Increase of the 
low multipoles by our analysis (figure |HJ thus requires a 
lower value of Q m to fit the data. This is more promi- 
nent for KP2, where the best fit value is Q m = 0.24+QQ5, 
than KP0 which gives O ro = 0.26lgQg, but the latter 
contains less area and its error distribution is slightly 
broader. Lower fi m values are also preferred in the joint 
WMAPext+SDSS analysis, but here the SDSS data tend 
to push the overall value up to fl m = 0.27^0 03- ^ n these 
8 parameter chains the WMAP \ 2 is higher by about 5 
compared to the WMAP without SDSS. Thus there is a 
bit of a tension between the SDSS data, favouring high 
f2 m and the WMAP data favouring low values of this 
parameter, although the statistical significance of this 
tension is low. For low f2 OT = 0.24 the Hubble param- 
eter is h — 0.75, still in agreement with the HST key 
project value of h = 0.72 ± 0.08 [3(j ■ If we eliminate 
tensors from the analysis then we find f2 m = 0.301q'q^ 
for WMAP+SDSS+VSA combination of the data. The 
overall conclusion is that values of O m between 0.2 to 0.4 
remain acceptable by the data and that the actual value 
depends strongly on the choice of parameter space. 

B. Running 

Running has attracted a lot of attention ever since 
WMAP team argued for a 2-cr evidence of negative run- 
ning. When analyzing CMB data alone one finds that 
running is strongly correlated with the optical depth r. 
Figure EH shows an example of this in WMAP+VSA 
MCMCs. We see that this particular combination of 
data prefers r > 0.3 and that such a high value of op- 
tical depth requires large negative running. A similar 
effect has been noticed in WMAP+CBI analysis [3lJ and 
WMAP+VSA analysis 33- We find that the statistical 
significance of running is strongly affected by the adopted 
prior on r. In fact, when prior on r is relaxed, the one- 
dimensional marginalised probability distribution seem 
to prefer models with high values of r and large nega- 
tive running. However, we note that this is the result 
of the large posterior probability volume in this region, 
rather than a better fit to the data. Moreover, such high 
values of optical depth are difficult to reconcile with the 
hierarchical models of structure formation and would re- 
quire a lot of small scale power, contrary to the effect of a 
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FIG. 8: Probability distribution p(f2 m ) and its cumu- 
lative value J^ m p(Q' m )dQ' m for 5-parameter MCMCs of 
WMAPext data (bottom) and for 8-parameter MCMCs of 
WMAPext+SDSS data (top). We present V frequency map 
and both KPO and KP2 mask results for the full likelihood 
analysis of 5-parameters MCMCs of WMAPext data and V 
KP2 for full likelihood analysis of 8-parameter MCMCs of 
WMAPext+SDSS data. Also shown for comparison are the 
results using regular (old) WMAP analysis routine. 



0.4 



0.2 




FIG. 9: Probability distribution p(a s ) and its cumula- 
tive value J_ s p(a' g )da' s for old and new MCMCs using 
WMAPext+SDSS data. We use V frequency map and KP2 
mask in the full likelihood analysis. 



negative running. Even more importantly, a high optical 
depth would lead to a large signal in WMAP EE polar- 
ization spectrum. To eliminate this region of parameter 
space WMAP team adopted a prior r < 0.3 and we fol- 
low that for most of our MCMCs. However, one can also 
eliminate this region of parameter space by adding the 
SDSS data, which do not favor the high optical depth 
values (ngure rrU|) and we give an example of this in table 
2. 

In this paper we are more interested in how running 
changes if we use the exact likelihood routine as opposed 
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FIG. 10: Two dimensional contours of 68% and 95% prob- 
ability in (a s ,r) and (a a ,z r i) plane from WMAP+VSA and 
WMAP+VSA+SDSS data. 



to the approximate one. The resulting values of the run- 
ning for various cases are given in tables 1-2. They are 
significantly affected by the exact likelihood calculations. 
This is expected from the analysis presented in previ- 
ous section, where we have shown that the exact likeli- 
hood analysis with foreground marginalization leads to 
an enhancement of low £ multipoles and broadens the 
shape of the likelihood distribution for quadrupole to al- 
low a higher likelihood for models with less negative run- 
ning. Figure shows the MCMC generated probability 
distributions for running a s using WMAPext+SDSS in 
8-parameter models. Note that there is a strong corre- 
lation between running and tensors in such a way that 
for no tensors there is less evidence for running [25{, So 
some of the evidence for running in the 8-parameter anal- 
ysis (and in is driven simply by the large parameter 
space of r > models and should not be taken as an 
evidence of running on its own. Even so we find that the 
evidence for running, marginally suggested by the old 
analysis, largely goes away in the new analysis and the 
value of running changes from -0.060 to -0.015 (V KP2, 
full marginalization) or -0.032 (W KP2, dust marginal- 
ization only), with an error of 0.035. This confirms that 
the suggested evidence for running relies crucially on low 
quadrupole and octupole |3J|, for which the statistical 
analysis and foreground removal are least reliable. 

This point was also noted in the recent analysis of 
WMAP+VSA data H3, where the WMAP likelihood 
code was used and evidence in excess of 2-er for run- 
ning was found, while removing £ < 10 information re- 
duced this evidence to less than 1-er. While one should 
not simply remove the entire £ < 10 information one 
should use the exact calculations instead of approxi- 
mate ones if the answer depends on it. Our results 
for WMAP+SDSS+VSA analysis for 7-parameter mod- 
els without tensors given in table 2 show that run- 
ning is strongly suppressed with the new analysis, a s = 
— 0.022lg Q3 2 , even without adopting any prior on the 
optical depth. 
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FIG. 11: Two dimensional contours of 68% and 95% prob- 
ability in (a s ,n s ) plane for old and new MCMCs using 
WMAPext+SDSS data. We use V frequency map and KP2 
mask in the full likelihood analysis. 



As shown in table 1 the best fitted value of the primor- 
dial slope n s increases appreciably as well, although this 
is mostly a consequence of the change in running. This is 
clarified in figure ITTI which shows old and new contours 
in (n s , a s ) plane. There is some degeneracy between the 
two parameters, so that models with low values of run- 
ning also require low slope. Since low values of running 
are excluded by the new analysis this implies that low 
values of the primordial slope are also excluded, pushing 
the average slope up. 



V. CONCLUSIONS 

In this paper we have developed routines to calculate 
the exact likelihood of the low resolution WMAP data. 
We have projected out unwanted foreground components 
by adding the foreground templates to our covariance 
matrix with large variance. Both of these methods have 
not been applied to WMAP data before and should 
improve upon the existing analyses. We have tested 
the robustness of our results by applying the method 
to many different combinations of observing frequency, 
mask, smoothing and templates and found consistent re- 
sults among these various cases. In particular, we find 
consistent results if we marginalize only over dust in 
W channel as opposed to all 3 foreground templates, if 
we use templates external to WMAP instead of WMAP 
MEM templates, if we use KP0 instead of KP2 mask, if 
we use ILC maps instead of individual V or W frequencies 
or if we use Healpix windows instead of gaussian smooth- 
ing. The two most important features of our procedure 
are thus marginalization over dust and exact likelihood 
analysis. 

Important differences exist between our results and 
previous work. We find higher values of the lowest mul- 
tipoles, which is partly a consequence of template sub- 
traction method used in WMAP analysis. This proce- 



dure would certainly remove some of the real power, al- 
though it is difficult to estimate how much and the dif- 
ferences could also be just a statistical fluctuation. For 
the maximum likelihood value of the quadrupole we find 
values between the original WMAP analysis and subse- 
quent reanalysis by 12] . The differences are within the 
estimated error of the foreground contamination and we 
argue that the actual value is not very reliable given how 
broad the likelihood is at the peak. More important is 
the shape of the likelihood function, which we find to 
be broader than in the WMAP team provided likelihood 
evaluation, which underestimates the errors compared to 
our analysis. This lowers the statistical significance of 
the departure of the data from the concordant model. 
Within a Bayesian context and assuming a flat prior on 
the distribution of quadrupoles we find the probability 
that a model exceeds the concordance model predicted 
quadrupole to be 10%. We also do not find anything 
particularly unusual in the correlation function and in 
the joint quadrupole-octopole analysis. 

We combine the full likelihood calculation with fore- 
ground marginalization at low £ with the original WMAP 
PCL analysis at high £ to generate Monte Carlo Markov 
Chains, whose distribution converges to the probability 
distribution of theoretical models given the data and as- 
sumed priors. The main effect of the new analysis is on 
the running of the spectral index, for which the marginal 
2 sigma evidence for a s < present in the original anal- 
ysis and in the recent analysis of WMAP+VSA (see 
also [3lJ ) is reduced to below 1 sigma. Using the exact 
WMAP likelihood analysis will be essential for attempts 
to determine the running of the spectral index by com- 
bining WMAP with either the small scale CMB data or 
with the upcoming Ly-a forest analysis from SDSS. In 
all of these cases the exact method increases the value of 
the running by pushing up the CMB spectrum at large 
scales. Another parameter which is significantly affected 
is the matter density £l m or, equivalently, the dark en- 
ergy density f^A. We find Q m to be reduced by the new 
analysis because of the added power at low multipoles, 
which is most easily accounted for by an increase in ISW 
contribution. 

We have shown that the effects of the improved like- 
lihood analysis presented here can be significant for the 
determination of cosmological parameters. We expect 
the methods applied here will be equally important for 
the analysis of polarization data in WMAP, where the 
foregrounds play a much more important role and where 
a full likelihood analysis of joint temperature and po- 
larization data is necessary to extract the maximum 
amount of information. Current analysis of temperature- 
polarization data is rather unsatisfactory, since it is based 
on the cross-spectrum information alone. Without hav- 
ing access to the full polarization maps we cannot im- 
prove upon it here. Thus the results shown in tables 1-2 
should still be viewed as preliminary regarding the opti- 
cal depth, which is essentially determined by the polar- 
ization data. Upcoming WMAP 2-year analysis/release 
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Tabic 1: median value, la and 2a constraints on cosmological parameters for various MCMCs based on WMAP data alone. 5p denotes 
varying 5 basic cosmological parameters in MCMCs, while 8p stands for 8 parameter chains. Old stands for the evaluation of the WMAP 
likelihood using the current WMAP provided software, VKP2 is our new exact likelihood evaluation analysis of V maps using KP2 mask 

and VKPO is the same for KPO mask. 





5p old 


5p VKP2 


5p VKPO 


8p old 


8p VKP2 


W 2 U! b 


z - w -0.06 -0.13 


z -°°-0.07 -0.13 


4_n _i_n i ^ 
Z ' Oy -0.06 -0.13 


Z -°'-0.16 -0.32 


in 1 q j_n 
z -^ y -0.17 -0.34 




~ ~~^0 08 ^0 16 
u - za -0.06 -0.11 


^ ^ a 4-0 07 4-0 1 R 
u -^-0.05 -0.10 


^ ,-.✓■> 4-0 07 4-0 1 fi 
r\ OCT"'"' -|-u.-lu 

u - zo -0.06 -0.11 


^ ~^4-0 07 4-0 1 fi 
u - zu -0.06 -0.10 


^ - - 4-0 Ofi 4-0 1 3 
u - lo -0.04 -0.07 


^cdm 


n i 9 +0.017 +0.03 
u - iz -0.017 -0.03 


n i i +0.016 +0.03 
U - 1J --0.016 -0.03 


n i i +0.017 +0.03 
U - ±J --0.016 -0.03 


n 1n +0.0l7 +0.03 
u - lu -0.017 -0.03 


n nQ +0.016 +0.03 
u - ua -0.015 -0.03 


T 


n. -i 7 +0.04 +0.08 
u - ± -0.04 -0.09 


n 91 +0.04 +0.07 
u -0.04 -0.08 


n 1Q+ 04 +0.08 
Uil -0.04 -0.08 


pv 9Q+0.05 +0.07 
u - z -0.08 -0.16 


n 9/1 +0.05 +0.06 
u -^-0.08 -0.17 




94 +0.07 +0.13 
-0.08 -0.17 


n qn+ - 08 + - 15 
-0 .09 -0.19 


n Q9+O.O8 +0.15 
u ' a -0.09 -0.19 


n oi+0.12 +0.25 
u- ° -0.13 -0.26 


n 7C+0.13 +0.24 
Ui ' -0.13 -0.25 


h 


n 79 +0.05 +0.10 
u -' z -0.05 -0.08 


n 7c .+0.05 +0.11 
u -'°~0.05 -0.09 


n 7 o+0.05 +0.11 
u - ' -0.05 -0.09 


n 70+0. 08 +0.19 
u -'°-0.07 -0.13 


n 07+O.O9 +0.19 
u -°' -0.08 -0.15 


T/S 











< 0.76 (95%) 


< 0.81 (95%) 


n a 


1 


1 


1 


n nc+0.07 +0.14 
u - a -0.07 -0.15 


, n9 +0.07 +0.15 
u -0.07 -0.15 













n ns +0.05 +0.10 
u - uo -0.06 -0.13 


n n4 +0.05 +0.10 
UiU -0.06 -0.13 



Table 2: Same as Table 1 for WMAP+SDSS (8-parameter MCMCs with regular (old) or corrected (exact likelihood) analysis). The new 
analysis uses V KP2 with full marginalization and W KP2 with dust marginalization only. We also give WMAP+SDSS+VSA 

(7-parameters). For the latter case we do not impose r < 0.3. 





8p SDSS+old 


8p SDSS+VKP2 


8p SDSS+WKP2 


7p SDSS+VSA+VKP2 


10 2 c^ 

u cdm 
T 

0"8 
h 

T/S 

n s 
a a 


9 4n +0.16 +0.32 
z -^ u -0.16 -0.30 

n 01 +0.06 +0.13 
u - Ji -0.05 -0.08 

n 1 9O+0.009 +0.019 
u.izo_ 00g _ 016 

n 9n +0.07 +0.09 
u -^ u -0.08 -0.14 

n QS+ ' 08 + ' 16 
u - ao -0.09 -0.16 

n 7n+ - 05 +0-09 

u -' u -0.05 -0.09 

< 0.46 (95%) 

n 07+O.O6 +0.11 
u - a '-0.06 -0.12 

n nfin+ 038 +0-074 

u.uuu_q 039 _o.Q83 


9 /10+O.I6 +0.30 
z -^°-0.16 -0.31 

n 97+O.O5 +0.11 
u - z '-0.03 -0.06 

n 1 91 +0.008 +0.017 
u.izi_ 007 _ 014 

n 9 n +0.07 +0.09 
u -^ u -0.08 -0.14 

n Q7+0.09 +0.16 
-0.09 -0.16 

n 70+O.O4 +0.08 
u -'°-0.04 -0.09 

< 0.46 (95%) 

, m +0.05 +0.10 
i - ui -0.06 -0.11 

n ni k+0.036 +0.072 
U.U10_ () 037 _ .080 


9 -7+O.I6 +0.31 
-0.16 -0.30 

n 90+0.05 +0.11 
UiZO -0.04 -0.07 

n 190+0. 008 +0.017 
u.rzo_ 007 _ 014 

n 9n +0.07 +0.09 
u u -0.08 -0.14 

n q 7 +0.09 +0.16 
u - al -0.09 -0.16 

n 70+O.O4 +0.08 
"■'■ 3 -0.04 -0.09 

< 0.47 (95%) 

1 n 9+0.05 +0.10 
-"-■ u -0.06 -0.11 

n nQO+ - 036 +0.072 
U.UOZ_ () Q3g _Q.080 


9 o A +0.18 +0.52 
Z - J -0.15 -0.28 

n o n +0.06 +0.12 
u - ou -0.05 -0.10 

n 1 90+0. 008 +0.017 
u - ±z -0.008 -0.018 

n iQ+0.11 +0.26 
u - ±a -0.08 -0.13 

n qo+0.12 +0.29 
u - a -0.08 -0.13 

n 7n +0.05 +0.14 
u -' u -0.05 -0.08 



n 07+O.O6 +0.16 
U - M '-0.06 -0.11 

n n90+ 034 +0.069 
u - uzz -0.032 -0.062 



of polarization data should elucidate the current situa- 
tion. The code developed here will be made available to 
the community at cosmas.org. 
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